DOCUMENT RESUME

ED 084 889                                        EM 011 713



AUTHOR        Smith, John B.
TITLE         An Advanced Sequence of Computer Courses for
              Humanities Students: The Penn State Program.
INSTITUTION   Pennsylvania State Univ., University Park.
PUB DATE      Aug 73
NOTE          14p.; Paper presented at the Conference on Computers
              in the Undergraduate Curricula (Claremont,
              California, June 18-20, 1973)

EDRS PRICE    MF-$0.65 HC-$3.29
DESCRIPTORS   *Computer Assisted Instruction; Computer Programs;
              *Computers; Computer Science; *Computer Science
              Education; *Humanities; *Humanities Instruction;
              Problem Solving; Program Descriptions; Programing
              Languages; Undergraduate Study
IDENTIFIERS   Natural Language; Pennsylvania State University; PLI
              Programing Language



ABSTRACT 

A series of computer science courses at Pennsylvania 
State University is designed to meet the needs of undergraduate 
humanities students who wish to use computers. The first of three 
integrated courses exposes the student to the range of computer 
applications in the humanities and teaches him to write nontrivial 
programs in the PL/1 Programing Language. Instruction is arranged
around programing problems; students survey the literature of 
computer applications and do a design project. The second course 
concentrates upon teaching students how to break complex tasks into 
their components. Natural language use is stressed and additional 
technical information is presented, including matters such as job 
control language and system utilities. In the third course the 
student solves a complex problem. He develops a thesis, translates it 
into operational terms and computational procedures, performs an 
analysis, interprets the results, and maps the results back to the
level from which he began. The sequence has been judged successful 
since it teaches students both the general techniques of 
problem-solving and, more specifically, creative, substantive ways to 
use the computer in the humanities. (PB)



FILMED FROM BEST AVAILABLE COPY 






An Advanced Sequence of Computer Courses for Humanities Students: The
Penn State Program



By John B. Smith 

Department of English 
Pennsylvania State University 
University Park, Pa. 16802 
Home: 814-238-2767 
Offices: -865-9681 
-865-0438 



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING
IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT
OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.







Introduction 

During the past five or six years the growth of special courses in computer
techniques for humanities students has been dramatic. The September,
1972, Computers and the Humanities indicates that there are at least forty 
institutions over the country offering introductory courses. Since the 
survey was conducted by voluntarily submitted forms, this number is prob- 
ably conservative. This indication of awareness of the necessity for 
courses sensitive to the needs and intellectual habits of humanists is
most encouraging; however, if humanistic studies involving the computer
are to move from the level of the mechanical to the creative, if the computer
is ever to become an active, easily and naturally applied tool at the
undergraduate level, colleges and universities must offer more advanced
courses, perhaps concentrating on specific subject areas. Of the ten uni- 
versities listed in Computers and the Humanities who offer more than one 
course for humanities students only five offer advanced courses that build 
on and extend the introductory material. In the remarks that follow, I 
shall try to show not only the necessity of advanced courses, but the nec- 
essity of a tightly integrated sequence of courses each building on the 
preceding and leading to the course that follows. Since I have proposed 
and taught at Penn State a program of this sort I shall cast my remarks in 
the context of this experience. 

Rationale for Sequence

Any rationale for a sequence of special courses for humanists must be 
based on the realization that their mode or habits of thought differ from
those of their counterparts in the sciences, business, and agriculture.




Too often, particularly among computer scientists, humanities students are 
regarded simply as slow if not stupid when it comes to learning computer 
programming. Too much has been written about "the two cultures" or differences
in quantitative and verbal reasoning to warrant comment here; a more criti- 
cal factor has gone relatively unnoticed. Science students and students in 
the humanities often differ considerably in their tendency to move from the 
specific to the general and, more importantly, their sense of comfort with 
situations in which such jumps are impossible. For example, the student of 
literature is consistently asked to observe in his readings small, subtle 
patterns among words and ideas and then to extend these observations, quickly
and too often imprecisely, to generate abstract theses concerning whole
works, canons, and, sometimes, entire literary traditions. If the student
works in this manner long enough this mode of thought becomes habitual, if 
not reflexive. A mathematics undergraduate, on the other hand, may find 
such jumps quite foreign. For example, a student studying modern algebra 
begins with a set of axioms and then slowly and deliberately constructs a 
mathematical system. He must, of course, have some overview of the direction 
in which his efforts are going; but most of his time is spent focused on 
specific problems. Few would argue that one mode of thought is inherently
"better" than the other, but recognition of their differences leads to dif- 
ferent approaches for teaching computing to these two types of individuals. 
Because of the large volume of unfamiliar detail, the introductory course
must be structured in a way that gives the humanities student a sense of
continuity and total organization. 

For the physical or social science major, the computer fits closely 
familiar research designs so that once he knows how to use the computer he
knows what to use it for. This is not true for the humanist. The work he
does in his courses and research projects often involves high levels of
abstraction and subjectivity. Once he knows how to program, he may be
able to see the usefulness of a concordance, a word index, a reverse 
spelling dictionary, or a list of relative frequencies among words; but 
he is unlikely to see right away the more interesting and creative uses 
to which the computer can be applied. For example, it is unlikely that
he will see on his own the application of Fourier analysis for thematic 
studies because it is quite unlikely that he is familiar with this model 
if, in fact, he has ever heard of it. Even if presented with distribution 
graphs of themes, the unfamiliarity of thinking visually or spatially about 
a piece of literature or music necessitates considerably more exposure to
and instruction in these new analytic tools for the humanities student than 
for others. He must have time in which to assimilate these models and 
methods in addition to learning rudimentary programming skills. Only then 
will he be able to make the broad conceptual jump from mechanical skills 
to humanistic interest that will result in intellectually developmental 
uses of the computer. In addition to time, this takes an integrated program
that can connect the details of programming and control languages with
substantive applications in the student's field. The sequence of courses
described below represents an attempt to establish such a curriculum. 

First Course

The first course has two major objectives: to develop the student's
competence to write nontrivial programs in PL/1 and to expose him to a broad
range of humanities applications. In teaching programming, one must have
some sense of the culture shock that hits many humanities students when
faced with the overwhelming detail associated with the computer. The situation
is analogous to what someone from Appalachia would experience if suddenly
placed in the middle of Manhattan and told to function. Much of what
one needs to know to get along is unconscious; consequently even if he had an
experienced "instructor" at his side it would be impossible to anticipate
all of the novitiate's needs and impossible for the novitiate to remember
the information if it were given. With encouragement, however, the student 
will survive if he endures and works hard at learning the intricacies of 
his new environment. 

Translated into an introductory course in computing, this means that the
instructor must empathize with the discomfort the humanities student
experiences because of his need to grasp the structure of the whole. Although
this need cannot be met in the first few weeks, there are ways of organizing
the material that make it easier for the humanities student. It is most
important to provide some larger context in which the details of a programming
language can be placed. The context used in most introductory courses taught
by computer scientists is the logical structure of the programming language
itself. When they cover I/O they are likely to discuss a number of the
various options and control words in the language, comparing and contrasting
their various features. Similarly, if discussing looping they often introduce
the DO statement with all of its optional features. For the science
student who has a sense of where all of this is leading and how it may
eventually be applied to a real problem, the approach is fine; but for a
great many humanities students, it is disastrous.

A more reasonable approach for the humanities student is to organize
the material around a sequence of programming problems with which he can
identify. If the problems make sense to him, it is much easier for him to
learn the concepts necessary to solve them. Since most humanists share an 
interest in verbal materials, I use some four or five problems related to 
text processing. The first assignment is to write a program to read in 
several lines of text and print them out. Next they read in the text, ex- 
tract the words, and print them, one word per line. Then they read in the 
text, extract the words, sort them into alphabetical order, and print them. 
The fourth exercise is a binary search for particular words in a sorted 
list. By the time the students have completed these four problems they
will have used most of the basic features of the language. It is a source
of the greatest comfort to these students to know that it takes about
twenty new control words or concepts for the first problem, ten additional
for the second, five for the third, and only two or three for the fourth.
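
As an illustration only, the four exercises might look like the following in a present-day language. The sketch uses Python rather than the PL/1 of the course; the function names, the definition of a "word" as a run of letters, and the sample text are all assumptions made for this example, not the assignments as actually given.

```python
import re

def extract_words(lines):
    """Exercise 2: pull the words out of lines of text, in order."""
    words = []
    for line in lines:
        # Treat runs of letters (and apostrophes) as words; lowercase them.
        words.extend(re.findall(r"[a-z']+", line.lower()))
    return words

def binary_search(sorted_words, target):
    """Exercise 4: return the index of target in sorted_words, or -1."""
    low, high = 0, len(sorted_words) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_words[middle] == target:
            return middle
        if sorted_words[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return -1

# Hypothetical sample text, standing in for the punched input cards.
text = ["The river runs to the sea.", "The sea returns as rain."]

for line in text:                 # Exercise 1: read lines and print them
    print(line)

words = extract_words(text)
for word in words:                # Exercise 2: one word per line
    print(word)

sorted_words = sorted(words)      # Exercise 3: alphabetical order
print(sorted_words)

print(binary_search(sorted_words, "rain"))   # Exercise 4: position, or -1
```

Each step reuses the one before it, which is why the number of new concepts falls from problem to problem.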

As fewer syntactic forms are introduced, more and more emphasis is
placed on algorithmic principles. For example, when the concept of a subroutine
is introduced with the binary search, it represents a relief from
keypunching all those cards again as opposed to any really strange new
principle that must be grappled with. These four exercises consume four
or five weeks of the term; having survived this, the student is now ready 
for a relatively quick overview of the language that introduces additional 
features and places them in context with those they already know. By this 
time the student has at least a rudimentary knowledge of how to use the
computer.

Giving the student a sense of what to use it for is much more subtle,
and takes much longer (in fact, this is the major focus of the two succeeding
courses). A start is made by having the students read a great deal of outside
material describing actual applications in the humanities. To assist
this process, I make available to them a bibliography of some eight thousand
titles I have accumulated along with a retrieval program to access it.
During the second half of the course I also lecture more and more on appli- 
cations showing as closely as possible the actual computational steps in- 
volved. 

The major programming effort of the second half of the course concerns 
a term project of the student's design. This experience gives them a chance
to integrate the various facets of the course and, indeed, a good deal of
their undergraduate curriculum. Typical of the projects produced at this
level are one line poetry generators, random number art, and thematic 
studies of short stories. 

The first time this course was taught, we began with twenty-six students;
twenty-three finished. Of the twenty-three that finished, eighteen
took the second course along with several auditors. This record, we felt,
strongly indicated the validity of the course's underlying rationale. 

Second Course 

At the end of the first course students have a basic knowledge of PL/1
and have been exposed to a variety of applications in the humanities. Students
who stop formally studying the computer at this point but who wish to
apply it to their interests often find themselves in an awkward position.
Having the competence to write an exercise-type program and the intellectual
exposure to at least some glimpse of the potential power of the computer,
they often are over-optimistic in the tasks they assume and find themselves
embroiled in unrealistic and inefficient programming projects. The frustration
that results is sometimes enough to undo the previous semester's work
and turn them off from computing completely. The second course attempts to
avoid these difficulties by considering the way complex tasks can be broken
into individual programs or job steps. Thus, it focuses at a higher level, 
considering the individual program in the context of the larger resource 
environment of the system. Because of the backgrounds and needs of the
humanities students enrolled, this second course also emphasizes natural 
language techniques although I am careful to emphasize the generality of 
the principles involved. 

Text processing is viewed from a systems analytic perspective. The
process of analyzing natural language is broken down into the following
sequential steps: encoding, scanning, sorting, file-structuring, and
accessing. By realizing that what one gets out at the far end is determined
by what one puts in at the near end, the student is able to resolve
many of the seemingly arbitrary decisions that plague the novice. For
example, if one wishes to consider segmental length distributions, he had
better include punctuation marks when he types or punches the text. The
student can thus establish principles that guide him without restricting
himself to arbitrary conventions that make specific tasks awkward.
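
The punctuation example can be made concrete with a small sketch. It is written in Python rather than PL/1, and it assumes that a "segment" is the run of words between two punctuation marks; the function names and the sample sentence are hypothetical. The point is only that an encoding decision made at the near end fixes what can be measured at the far end.

```python
import re
from collections import Counter

def segment_lengths(text):
    """Number of words in each stretch of text between punctuation marks."""
    segments = re.split(r"[.,;:!?]", text)
    return [len(segment.split()) for segment in segments if segment.split()]

def length_distribution(text):
    """How many segments there are of each length."""
    return Counter(segment_lengths(text))

# Hypothetical sample, encoded once with and once without punctuation.
encoded_with_marks = "The sun rose, and the birds sang. We left."
encoded_without_marks = "The sun rose and the birds sang We left"

print(segment_lengths(encoded_with_marks))
print(segment_lengths(encoded_without_marks))
```

When the marks are omitted at encoding, the whole text collapses into a single segment and the distribution can no longer be recovered.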

The power of this approach is particularly evident in the matter of
data structuring. By classifying the kinds of tasks performed in language
analysis and the particular data organization they require, I am able to
show inferentially the necessity for a random accessible file structure
that makes each occurrence of each word available along with its complete
context. (See John B. Smith, "RATS: A Middle-Level Text Utility System,"
CHum 6,5 (May, 1972), 277-283 for a detailed discussion of this text system.)
Having thus classified problems and noted the utility and restrictions of
various computational approaches, the student is better qualified to analyze
subsequent problems and anticipate possible difficulties, preparing
him to use the computer in honors-type projects or, possibly, in later graduate
work.
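
A minimal sketch of the idea of making every occurrence of every word reachable together with its context follows. It is an in-memory Python illustration under stated assumptions, not the file structure of the text system cited above: the names `build_index` and `occurrences_in_context`, the `width` parameter, and the sample words are all hypothetical.

```python
from collections import defaultdict

def build_index(words):
    """Map each word to the list of positions at which it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(words):
        index[word.lower()].append(position)
    return index

def occurrences_in_context(words, index, target, width=2):
    """Return every occurrence of target with `width` words on each side."""
    contexts = []
    for position in index.get(target.lower(), []):
        start = max(0, position - width)
        contexts.append(" ".join(words[start:position + width + 1]))
    return contexts

# Hypothetical sample text, already scanned into words.
words = "the cat saw the dog and the bird".split()
index = build_index(words)
print(index["the"])
print(occurrences_in_context(words, index, "dog", width=1))
```

Because the index stores positions rather than copies of the words, any occurrence can be reached directly and as much surrounding context as desired can be recovered from the original sequence.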

The second course also contains much additional technical information. 
To combine various program modules to perform a sequence of operations demands
familiarity with, in this case, the IBM Job Control Language. A
thorough introduction to JCL, including readings in the technical manuals,
is given. Students are assigned several exercises involving tape and disk
data sets.

It is at this level that the student is introduced to some of the system
utilities. Because of the extensive use of sorting techniques, consideration
and exercises involving both the JCL-invoked SORT and a PL/1
link-edited sort are given. Other utilities, including IEBGENER, are
discussed.

Additional PL/1 concepts are introduced in conjunction with the JCL
exercises. Prominent among these are record I/O features, regional data
sets, and the PL/1 list-processing features, items omitted or discussed
only briefly in the introductory course. Some attention is also given to 
programming efficiency including timing tests to determine the relative 
efficiency of various blocks of code that perform the same operation. As 
before, the student is required to report his outside readings in applica- 
tions and to present a term project of his own design. 
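
A timing test of the kind mentioned above might be sketched as follows. The example is in Python, using the standard `timeit` module, and is only an analogy to the PL/1 exercises; the two word-counting functions and the sample data are assumptions chosen to show two blocks of code that produce the same result by different means.

```python
import timeit

def count_with_single_pass(words):
    """Tally word frequencies in one pass over the list."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts

def count_with_repeated_scans(words):
    """Tally word frequencies by rescanning the list once per distinct word."""
    return {word: words.count(word) for word in set(words)}

# Hypothetical data: a short word list repeated to give measurable work.
sample = ["the", "cat", "saw", "the", "dog"] * 200

# Both blocks must give the same answer before their speed is compared.
assert count_with_single_pass(sample) == count_with_repeated_scans(sample)

single_pass_seconds = timeit.timeit(lambda: count_with_single_pass(sample), number=100)
repeated_scan_seconds = timeit.timeit(lambda: count_with_repeated_scans(sample), number=100)

print("single pass:    ", single_pass_seconds)
print("repeated scans: ", repeated_scan_seconds)
```

The measured times depend on the machine and the data, so no particular figures are claimed here; the exercise lies in checking equivalence first and measuring second.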

Completion of the second course marks the end of the first phase of
the program. By this time the student has progressed far beyond the level
of programming competence achieved in a single introductory course. He has
learned JCL and is familiar with a fair number of system utilities. He has
seen the way programs of his own design and utility programs can be fitted
together to produce a rather sophisticated data handling system. By
considering problems systematically, he can see how seemingly inconsequential
details at one point determine later capabilities. Having completed
this sequence of courses, the student has the technical background to apply 
the computer to problems of his own design and interests. 

Third Course 

The third course marks a second phase of the program — an opportunity 
for the student to apply the techniques and information he has gained to an 
extended project of his own design. The projects done in this seminar are
suitable for undergraduate honors projects or, at the graduate level, trial
runs for masters and doctoral theses. The student is asked to follow a prescribed
method and report on his work at two different stages in progress
as well as make a final presentation. 

The method prescribed is based upon the assumption that the humanist 
must use the computer to explore the ideas and problems that interest him
and his colleagues, not those problems that may be raised simply because
the computer can solve them. Studies that result in frequency counts or
various ratios in the absence of a supporting context or rationale have
largely been ignored or dismissed in the humanities. Consequently, the
first step is to define and justify a thesis within the traditional terms
and methods of the student's discipline. At this level, the thesis argument
might read like that found in a conventional term paper or dissertation,
quite likely omitting any reference to the computer at all. Next, the definitions
and methods of the thesis must be translated into operative terms
and computational procedures. For example, an image might be defined on
the first level as any word having sensory or thematic value, but when
translated into computational terms, it becomes a partitioning of the vocabulary
into those words selected as images and those not. Often the mapping
onto the operative level is not one-to-one; for example, I have seen studies
that consider the associative relations among textual elements using a number
of analytic programs and aids, including frequency counts by text section,
concordances, and principal component analysis.
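
The translation of "image" into operative terms can be sketched in a few lines. The example below is in Python and rests on assumptions: the list of image words is supplied by the investigator (here a hypothetical two-word list), the text is already divided into sections, and the function names are invented for the illustration.

```python
from collections import Counter

def partition_vocabulary(words, image_words):
    """Split the vocabulary into words selected as images and all others."""
    vocabulary = set(words)
    images = vocabulary & set(image_words)
    return images, vocabulary - images

def image_counts_by_section(sections, image_words):
    """Frequency of each image word within each section of the text."""
    selected = set(image_words)
    return [Counter(word for word in section if word in selected)
            for section in sections]

# Hypothetical text in two sections, and a hypothetical list of image words.
sections = [["fire", "and", "water"], ["the", "fire", "burns"]]
image_words = ["fire", "water"]

all_words = [word for section in sections for word in section]
images, non_images = partition_vocabulary(all_words, image_words)
print(sorted(images), sorted(non_images))
print(image_counts_by_section(sections, image_words))
```

The critical judgment lies entirely in the choice of `image_words`; the program merely applies that choice consistently across the text.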

Having defined the thesis in conventional terms and translated it into 
operative terms and procedures, the student then performs his analysis.
When all procedures have been run, the student must interpret his results, 
but on the operative level. That is, if using factor analysis, he must 
determine the significant factors, their loadings, and their adequacy. If
using Fourier analysis for a thematic study, the student must determine the
significant frequencies, produce cumulative plots, and then seek inter-relations
among the various thematic distributions. When he has analyzed all of
his data in terms of the models and procedures in which they are defined, 
he is ready to move to the final step. 
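
For the Fourier case, interpretation on the operative level might begin from something like the sketch below. It is a Python illustration under assumptions, not the procedure used in the seminar: a theme is represented as a count of its occurrences in each successive section of the text, a plain discrete Fourier transform is computed directly from its definition, and the function names and sample counts are hypothetical.

```python
import cmath
from itertools import accumulate

def dft_magnitudes(series):
    """Magnitude of each frequency from 0 up to half the series length."""
    n = len(series)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(value * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, value in enumerate(series))
        magnitudes.append(abs(total) / n)
    return magnitudes

def dominant_frequency(series):
    """Strongest nonzero frequency, in cycles over the whole text."""
    mean = sum(series) / len(series)
    centered = [value - mean for value in series]   # remove the constant level
    magnitudes = dft_magnitudes(centered)
    return max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])

def cumulative_distribution(series):
    """Running total of theme occurrences, the basis of a cumulative plot."""
    return list(accumulate(series))

# Hypothetical counts of one theme in eight successive sections of a text.
theme_counts = [2, 4, 2, 0, 2, 4, 2, 0]
print(dominant_frequency(theme_counts))
print(cumulative_distribution(theme_counts))
```

A theme that rises and falls twice over the length of the text shows its strongest component at two cycles; whether that recurrence means anything is a question for the final step, not for the program.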

The last step involves mapping his results back out to the level from
which he started. The student must now present an argument supporting and
demonstrating his thesis that is once more understandable to his non-computational
colleagues. To be sure, computer produced materials and data
will be used in support of the argument, but the main presentation must be
cast in terms as close to the traditional ones as possible.

Students report on their projects at three different stages. All report
on the initial thesis statement and all make final presentations. At
some intermediate stage, students report on the mapping process, the operative
procedures being used, and anticipated directions. It is my experience
that projects executed in this way display a rigor of thought and argument
that is uncommon in much student work, particularly in the humanities. Of
the projects undertaken by students enrolled in my seminar on Computational
Stylistics, at least half should result in publishable work.

This last point raises a question implicit in the design of this entire
course: the value of research training at the undergraduate level.
The argument usually raised against training undergraduates to do research
is that many have no plans to go to graduate school. This argument, in my
opinion, ignores the value involved in the training experience itself. In
this course and this program students are asked to assimilate and apply a
wide variety of skills and information, much of it derived from other parts
of their educational experience, to an extended problem that interests them.
The problem-solving techniques developed are enough to justify the endeavor,
for whether the student ever uses a computer again or not, he will be much
better prepared to approach any complex problem, break it into smaller
problems, and assemble a solution. Finally, as no other, this course
gives the undergraduate an opportunity to integrate a large number of the
different facets of his curriculum. The experience of actually using his
training in literature, history, math, physics, and computer techniques on
a single project of his design is often exhilarating. What
more can we offer any student?

Conclusion

Other colleges and universities implementing computer courses for
humanists should give careful consideration to the department or college
in which they are to be located. In the past, computer science departments
have unquestionably contained the personnel best qualified to teach computing.
As Computer Science becomes more and more a recognized discipline, the
amount of attention and prestige associated with "academic" courses as
opposed to service courses has changed considerably. Today, the computer
scientist who puts the necessary effort into a course for humanists,
learning all of the specialized techniques and substantive applications
essential to teach it properly, is likely to be penalized within his profession;
he simply will not be taken seriously by his colleagues. The
major argument in favor of locating such courses in many Computer Science
departments seems to be territorial rather than pedagogical: regardless
of interests, no department wishes to lose students or credit hours that
support graduate programs. Ultimately, however, those of us deeply committed
to seeing the computer become in reality the tool it promises to be
for the humanities must insist that pedagogical concerns receive highest
priority; otherwise the program will be stifled. As more and more humanists
complete course sequences of this sort and continue to work in the
field, we can look to them for teachers. They are much more likely to have
the enthusiasm and dedication needed to teach other humanists than their
colleagues in Computer Science. The best of two worlds, however, may be
achieved through joint appointments of such individuals and crosslisted
courses, thereby insuring pedagogical priorities while distributing credit
hours to those most concerned about them.

This sequence of courses has been taught once under an experimental
Liberal Arts rubric, and the first two courses will be repeated during the
'73 winter and spring terms. We are currently negotiating permanent listings.
The program has been closely evaluated and is, we believe, a complete
success. The students emerge thoroughly grounded in the programming language
as well as the system resources and control language; they are able
to approach a complex problem, break it into modules, and anticipate the
power and limitations imposed by various decisions. Finally, they have
been taken step-by-step through an extended research project. To the best
of my knowledge, this program, because of its integrated, sequential organization,
is the most thorough in the country. Obviously it will change, but
the rationale behind it and the experience gained in teaching it may be 
useful to others in establishing their own programs to fit the needs of 
their students. 



